Overview

Dataset statistics

Number of variables: 18
Number of observations: 132
Missing cells: 81
Missing cells (%): 3.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.7 KiB
Average record size in memory: 145.0 B
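
The missing-cell figures can be cross-checked against the per-variable missing counts listed under Warnings. The following is a minimal sketch of that arithmetic, not the profiler's own code; the dictionary and variable names are illustrative, and it assumes every variable not listed has no missing values.

```python
# Per-variable missing counts as reported in the Warnings list.
# Assumption: variables not listed here have no missing values.
missing_per_column = {
    "Rate_Code": 12,
    "store_and_forward": 33,
    "mta_tax": 10,
    "Total_Amt": 26,
}
n_variables = 18
n_observations = 132

missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, total_cells, f"{missing_pct:.1f}%")
```

The sum of the four counts and the resulting percentage agree with the "Missing cells" and "Missing cells (%)" rows above.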

Variable types

NUM: 11
CAT: 6
BOOL: 1

Warnings

Start_Lat has a high cardinality: 72 distinct values (High cardinality)
End_Lon is highly correlated with Start_Lon and 2 other fields (High correlation)
Start_Lon is highly correlated with End_Lon and 2 other fields (High correlation)
End_Lat is highly correlated with Start_Lon and 2 other fields (High correlation)
mta_tax is highly correlated with Start_Lon and 2 other fields (High correlation)
Total_Amt is highly correlated with Fare_Amt (High correlation)
Fare_Amt is highly correlated with Total_Amt (High correlation)
Rate_Code has 12 (9.1%) missing values (Missing)
store_and_forward has 33 (25.0%) missing values (Missing)
mta_tax has 10 (7.6%) missing values (Missing)
Total_Amt has 26 (19.7%) missing values (Missing)
Trip_Pickup_DateTime has unique values (Unique)
Trip_Dropoff_DateTime has unique values (Unique)
Trip_Distance has 22 (16.7%) zeros (Zeros)
Start_Lon has 2 (1.5%) zeros (Zeros)
End_Lon has 2 (1.5%) zeros (Zeros)
End_Lat has 2 (1.5%) zeros (Zeros)
surcharge has 70 (53.0%) zeros (Zeros)
Tip_Amt has 41 (31.1%) zeros (Zeros)
Tolls_Amt has 69 (52.3%) zeros (Zeros)
Total_Amt has 18 (13.6%) zeros (Zeros)

Reproduction

Analysis started: 2020-12-31 01:51:49.509714
Analysis finished: 2020-12-31 01:52:09.295094
Duration: 19.79 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

vendor_name
Categorical

Distinct: 5
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value  Count  Frequency (%)
1      53     40.2%
CMT    39     29.5%
VTS    32     24.2%
2      7      5.3%
DDS    1      0.8%

Frequencies of value counts

Unique

Unique: 1
Unique (%): 0.8%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.090909091
Min length: 1

Trip_Pickup_DateTime
Categorical

UNIQUE

Distinct: 132
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value                Count  Frequency (%)
2012-09-01 05:35:00  1      0.8%
2013-03-01 00:00:04  1      0.8%
2017-04-01 00:00:00  1      0.8%
2017-02-03 02:03:50  1      0.8%
2014-08-16 14:58:49  1      0.8%
2009-04-08 12:19:00  1      0.8%
2019-03-01 00:24:41  1      0.8%
2018-10-01 00:23:34  1      0.8%
2014-01-09 20:45:25  1      0.8%
2013-08-26 15:33:22  1      0.8%
Other values (122)   122    92.4%

Frequencies of value counts

Unique

Unique: 132
Unique (%): 100.0%

Histogram of lengths of the category

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Trip_Dropoff_DateTime
Categorical

UNIQUE

Distinct: 132
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value                Count  Frequency (%)
2017-03-09 21:44:20  1      0.8%
2009-10-26 13:17:00  1      0.8%
2019-10-01 00:55:17  1      0.8%
2019-09-01 00:25:46  1      0.8%
2019-05-01 00:37:27  1      0.8%
2017-09-01 00:18:49  1      0.8%
2013-05-01 00:12:00  1      0.8%
2019-09-01 00:57:54  1      0.8%
2018-01-01 00:24:23  1      0.8%
2011-04-29 05:55:00  1      0.8%
Other values (122)   122    92.4%

Frequencies of value counts

Unique

Unique: 132
Unique (%): 100.0%

Histogram of lengths of the category

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Passenger_Count
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.318181818
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3.45
Maximum: 6
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9596292368
Coefficient of variation (CV): 0.7279945934
Kurtosis: 12.33705966
Mean: 1.318181818
Median Absolute Deviation (MAD): 0
Skewness: 3.531159149
Sum: 174
Variance: 0.920888272
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values
Value  Count  Frequency (%)
1      113    85.6%
2      10     7.6%
5      3      2.3%
6      2      1.5%
4      2      1.5%
3      2      1.5%

Minimum 5 values
Value  Count  Frequency (%)
1      113    85.6%
2      10     7.6%
3      2      1.5%
4      2      1.5%
5      3      2.3%

Maximum 5 values
Value  Count  Frequency (%)
6      2      1.5%
5      3      2.3%
4      2      1.5%
3      2      1.5%
2      10     7.6%

Trip_Distance
Real number (ℝ≥0)

ZEROS

Distinct: 78
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.239469697
Minimum: 0
Maximum: 20
Zeros: 22
Zeros (%): 16.7%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.4975
Median: 1.42
Q3: 2.7575
95-th percentile: 5.87
Maximum: 20
Range: 20
Interquartile range (IQR): 2.26

Descriptive statistics

Standard deviation: 2.988884577
Coefficient of variation (CV): 1.334639438
Kurtosis: 15.41844051
Mean: 2.239469697
Median Absolute Deviation (MAD): 1.1
Skewness: 3.447411751
Sum: 295.61
Variance: 8.933431014
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
0                  22     16.7%
1.2                6      4.5%
2                  4      3.0%
0.4                4      3.0%
0.5                3      2.3%
1.1                3      2.3%
2.5                3      2.3%
1.5                3      2.3%
2.6                3      2.3%
0.8                3      2.3%
Other values (68)  78     59.1%

Minimum 5 values
Value  Count  Frequency (%)
0      22     16.7%
0.11   1      0.8%
0.17   1      0.8%
0.3    2      1.5%
0.39   1      0.8%

Maximum 5 values
Value  Count  Frequency (%)
20     1      0.8%
17.52  1      0.8%
14.3   1      0.8%
10.23  1      0.8%
9.8    1      0.8%
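
Several of the Trip_Distance statistics follow directly from others in the same section. The sketch below recomputes them from the reported values, using the standard definitions (IQR = Q3 − Q1, range = maximum − minimum, CV = standard deviation / mean, variance = standard deviation squared). It is an illustration with made-up variable names, not the profiler's implementation.

```python
# Values copied from the Trip_Distance statistics in this report.
q1, q3 = 0.4975, 2.7575
minimum, maximum = 0.0, 20.0
mean, std = 2.239469697, 2.988884577

iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance from the standard deviation

print(round(iqr, 4), value_range, round(cv, 6), round(variance, 6))
```

Up to rounding, the results match the reported IQR, range, CV and variance.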
 

Start_Lon
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 72
Distinct (%): 54.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -38.77506682
Minimum: -74.01405
Maximum: 1
Zeros: 2
Zeros (%): 1.5%
Memory size: 1.0 KiB

Quantile statistics

Minimum: -74.01405
5-th percentile: -74.00329105
Q1: -73.9818405
Median: -73.943541
Q3: 1
95-th percentile: 1
Maximum: 1
Range: 75.01405
Interquartile range (IQR): 74.9818405

Descriptive statistics

Standard deviation: 37.5456034
Coefficient of variation (CV): -0.9682924228
Kurtosis: -2.015634718
Mean: -38.77506682
Median Absolute Deviation (MAD): 0.0677955
Skewness: 0.1228656569
Sum: -5118.308821
Variance: 1409.672334
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1                  60     45.5%
0                  2      1.5%
-73.977753         1      0.8%
-73.99635          1      0.8%
-73.999939         1      0.8%
-73.995642         1      0.8%
-73.977357         1      0.8%
-73.971326         1      0.8%
-73.98718          1      0.8%
-73.992438         1      0.8%
Other values (62)  62     47.0%

Minimum 5 values
Value       Count  Frequency (%)
-74.01405   1      0.8%
-74.013227  1      0.8%
-74.009446  1      0.8%
-74.007549  1      0.8%
-74.005867  1      0.8%

Maximum 5 values
Value       Count  Frequency (%)
1           60     45.5%
0           2      1.5%
-73.7767    1      0.8%
-73.787442  1      0.8%
-73.934798  1      0.8%

Start_Lat
Categorical

HIGH CARDINALITY

Distinct: 72
Distinct (%): 54.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value               Count  Frequency (%)
N                   60     45.5%
0                   2      1.5%
40.779775999999998  1      0.8%
40.763660999999999  1      0.8%
40.750458000000002  1      0.8%
40.711706999999997  1      0.8%
40.729084           1      0.8%
40.737569999999998  1      0.8%
40.757976999999997  1      0.8%
40.738596999999999  1      0.8%
Other values (62)   62     47.0%

Frequencies of value counts

Unique

Unique: 70
Unique (%): 53.0%

Histogram of lengths of the category

Length

Max length: 18
Median length: 9
Mean length: 9.310606061
Min length: 1

Rate_Code
Real number (ℝ≥0)

MISSING

Distinct: 30
Distinct (%): 25.0%
Missing: 12
Missing (%): 9.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.43333333
Minimum: 1
Maximum: 264
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 21.5
Q3: 145
95-th percentile: 239
Maximum: 264
Range: 263
Interquartile range (IQR): 144

Descriptive statistics

Standard deviation: 90.39267589
Coefficient of variation (CV): 1.123821084
Kurtosis: -1.169811989
Mean: 80.43333333
Median Absolute Deviation (MAD): 20.5
Skewness: 0.5796280224
Sum: 9652
Variance: 8170.835854
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=30)

Common values
Value              Count  Frequency (%)
1                  59     44.7%
145                17     12.9%
161                4      3.0%
239                3      2.3%
230                3      2.3%
151                2      1.5%
193                2      1.5%
95                 2      1.5%
48                 2      1.5%
238                2      1.5%
Other values (20)  24     18.2%
(Missing)          12     9.1%

Minimum 5 values
Value  Count  Frequency (%)
1      59     44.7%
2      1      0.8%
41     2      1.5%
45     1      0.8%
48     2      1.5%

Maximum 5 values
Value  Count  Frequency (%)
264    1      0.8%
263    1      0.8%
262    1      0.8%
249    1      0.8%
239    3      2.3%

store_and_forward
Categorical

MISSING

Distinct: 37
Distinct (%): 37.4%
Missing: 33
Missing (%): 25.0%
Memory size: 1.0 KiB

Value              Count  Frequency (%)
N                  32     24.2%
145                16     12.1%
0                  6      4.5%
161                3      2.3%
239                3      2.3%
7                  2      1.5%
24                 2      1.5%
193                2      1.5%
246                2      1.5%
234                2      1.5%
Other values (27)  29     22.0%
(Missing)          33     25.0%

Frequencies of value counts

Unique

Unique: 25
Unique (%): 25.3%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.303030303
Min length: 1

End_Lon
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 74
Distinct (%): 56.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -38.51594494
Minimum: -74.015534
Maximum: 3
Zeros: 2
Zeros (%): 1.5%
Memory size: 1.0 KiB

Quantile statistics

Minimum: -74.015534
5-th percentile: -73.9958961
Q1: -73.9808765
Median: -73.93310747
Q3: 1.25
95-th percentile: 2
Maximum: 3
Range: 77.015534
Interquartile range (IQR): 75.2308765

Descriptive statistics

Standard deviation: 37.82084394
Coefficient of variation (CV): -0.981952903
Kurtosis: -2.015122876
Mean: -38.51594494
Median Absolute Deviation (MAD): 0.07363402581
Skewness: 0.123182447
Sum: -5084.104732
Variance: 1430.416237
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
2                  32     24.2%
1                  27     20.5%
0                  2      1.5%
-73.976247         1      0.8%
-73.986397         1      0.8%
-73.9946           1      0.8%
-73.983492         1      0.8%
-73.973602         1      0.8%
-73.957112         1      0.8%
-73.976193         1      0.8%
Other values (64)  64     48.5%

Minimum 5 values
Value       Count  Frequency (%)
-74.015534  1      0.8%
-74.010464  1      0.8%
-74.00999   1      0.8%
-74.003493  1      0.8%
-73.999419  1      0.8%

Maximum 5 values
Value       Count  Frequency (%)
3           1      0.8%
2           32     24.2%
1           27     20.5%
0           2      1.5%
-73.776308  1      0.8%

End_Lat
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 100
Distinct (%): 75.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.50447503
Minimum: 0
Maximum: 40.811692
Zeros: 2
Zeros (%): 1.5%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.5
Q1: 6.75
Median: 40.6932995
Q3: 40.7572485
95-th percentile: 40.78042395
Maximum: 40.811692
Range: 40.811692
Interquartile range (IQR): 34.0072485

Descriptive statistics

Standard deviation: 16.90725235
Coefficient of variation (CV): 0.6629131683
Kurtosis: -1.771954185
Mean: 25.50447503
Median Absolute Deviation (MAD): 0.0963895
Skewness: -0.3110408982
Sum: 3366.590704
Variance: 285.855182
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
2.5                14     10.6%
3                  5      3.8%
3.5                4      3.0%
14                 3      2.3%
4.5                3      2.3%
0                  2      1.5%
5.5                2      1.5%
13                 2      1.5%
4                  2      1.5%
7.5                2      1.5%
Other values (90)  93     70.5%

Minimum 5 values
Value  Count  Frequency (%)
0      2      1.5%
2.5    14     10.6%
3      5      3.8%
3.5    4      3.0%
4      2      1.5%

Maximum 5 values
Value      Count  Frequency (%)
40.811692  1      0.8%
40.8084    1      0.8%
40.792058  1      0.8%
40.78732   1      0.8%
40.785647  1      0.8%

Payment_Type
Categorical

Distinct: 12
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value             Count  Frequency (%)
0.5               41     31.1%
CRD               27     20.5%
CSH               26     19.7%
3                 15     11.4%
CASH              7      5.3%
Credit            4      3.0%
Cas               3      2.3%
CAS               3      2.3%
1                 2      1.5%
0                 2      1.5%
Other values (2)  2      1.5%

Frequencies of value counts

Unique

Unique: 2
Unique (%): 1.5%

Histogram of lengths of the category

Length

Max length: 6
Median length: 3
Mean length: 2.863636364
Min length: 1

Fare_Amt
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 39
Distinct (%): 29.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.872727273
Minimum: 0.5
Maximum: 45
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 0.5
Q1: 0.5
Median: 4.1
Q3: 8.2
95-th percentile: 16.9
Maximum: 45
Range: 44.5
Interquartile range (IQR): 7.7

Descriptive statistics

Standard deviation: 7.910203006
Coefficient of variation (CV): 1.346938592
Kurtosis: 10.16702135
Mean: 5.872727273
Median Absolute Deviation (MAD): 3.6
Skewness: 2.776624281
Sum: 775.2
Variance: 62.57131159
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=39)

Common values
Value              Count  Frequency (%)
0.5                60     45.5%
6.9                6      4.5%
9.3                4      3.0%
4.5                4      3.0%
12.1               4      3.0%
6.5                4      3.0%
6                  3      2.3%
4.1                3      2.3%
4.9                3      2.3%
45                 2      1.5%
Other values (29)  39     29.5%

Minimum 5 values
Value  Count  Frequency (%)
0.5    60     45.5%
2.5    2      1.5%
2.9    1      0.8%
3.3    1      0.8%
3.7    1      0.8%

Maximum 5 values
Value  Count  Frequency (%)
45     2      1.5%
39.5   1      0.8%
26.5   2      1.5%
18     2      1.5%
16     1      0.8%

surcharge
Real number (ℝ≥0)

ZEROS

Distinct: 19
Distinct (%): 14.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5828030303
Minimum: 0
Maximum: 6.3
Zeros: 70
Zeros (%): 53.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.5
95-th percentile: 2.4225
Maximum: 6.3
Range: 6.3
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.9905032373
Coefficient of variation (CV): 1.699550596
Kurtosis: 10.13071067
Mean: 0.5828030303
Median Absolute Deviation (MAD): 0
Skewness: 2.807869629
Sum: 76.93
Variance: 0.9810966632
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)

Common values
Value             Count  Frequency (%)
0                 70     53.0%
0.5               30     22.7%
1                 10     7.6%
2                 5      3.8%
1.5               2      1.5%
4                 2      1.5%
3.06              1      0.8%
2.75              1      0.8%
1.47              1      0.8%
2.4               1      0.8%
Other values (9)  9      6.8%

Minimum 5 values
Value  Count  Frequency (%)
0      70     53.0%
0.5    30     22.7%
0.7    1      0.8%
1      10     7.6%
1.14   1      0.8%

Maximum 5 values
Value  Count  Frequency (%)
6.3    1      0.8%
4      2      1.5%
3.95   1      0.8%
3.06   1      0.8%
2.75   1      0.8%

mta_tax
Boolean

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 1.6%
Missing: 10
Missing (%): 7.6%
Memory size: 1.0 KiB

Value      Count  Frequency (%)
0.5        62     47.0%
0          60     45.5%
(Missing)  10     7.6%

Tip_Amt
Real number (ℝ≥0)

ZEROS

Distinct: 21
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7362878788
Minimum: 0
Maximum: 10.1
Zeros: 41
Zeros (%): 31.1%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.3
Q3: 0.3
95-th percentile: 2.725
Maximum: 10.1
Range: 10.1
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 1.466532705
Coefficient of variation (CV): 1.991792542
Kurtosis: 21.02805066
Mean: 0.7362878788
Median Absolute Deviation (MAD): 0.3
Skewness: 4.173127027
Sum: 97.19
Variance: 2.150718176
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)

Common values
Value              Count  Frequency (%)
0.3                60     45.5%
0                  41     31.1%
1                  7      5.3%
2                  3      2.3%
2.12               2      1.5%
3                  2      1.5%
2.4                2      1.5%
2.5                2      1.5%
1.3                1      0.8%
9                  1      0.8%
Other values (11)  11     8.3%

Minimum 5 values
Value  Count  Frequency (%)
0      41     31.1%
0.3    60     45.5%
1      7      5.3%
1.18   1      0.8%
1.3    1      0.8%

Maximum 5 values
Value  Count  Frequency (%)
10.1   1      0.8%
9      1      0.8%
6.75   1      0.8%
3.8    1      0.8%
3.7    1      0.8%

Tolls_Amt
Real number (ℝ≥0)

ZEROS

Distinct: 40
Distinct (%): 30.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.31
Minimum: 0
Maximum: 33.3
Zeros: 69
Zeros (%): 52.3%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 8.1875
95-th percentile: 22.475
Maximum: 33.3
Range: 33.3
Interquartile range (IQR): 8.1875

Descriptive statistics

Standard deviation: 7.813808388
Coefficient of variation (CV): 1.471527003
Kurtosis: 2.020590915
Mean: 5.31
Median Absolute Deviation (MAD): 0
Skewness: 1.640432595
Sum: 700.92
Variance: 61.05560153
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=40)

Common values
Value              Count  Frequency (%)
0                  69     52.3%
3.8                12     9.1%
4.3                5      3.8%
4.8                3      2.3%
8.3                2      1.5%
12.3               2      1.5%
4.57               2      1.5%
16.3               2      1.5%
20.3               2      1.5%
8.8                2      1.5%
Other values (30)  31     23.5%

Minimum 5 values
Value  Count  Frequency (%)
0      69     52.3%
3.8    12     9.1%
4.15   1      0.8%
4.3    5      3.8%
4.57   2      1.5%

Maximum 5 values
Value  Count  Frequency (%)
33.3   1      0.8%
31.6   1      0.8%
27.8   1      0.8%
26.3   1      0.8%
25.3   1      0.8%

Total_Amt
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 62
Distinct (%): 58.5%
Missing: 26
Missing (%): 19.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.784716981
Minimum: 0
Maximum: 58.15
Zeros: 18
Zeros (%): 13.6%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2.5
Median: 7
Q3: 11.595
95-th percentile: 22.65
Maximum: 58.15
Range: 58.15
Interquartile range (IQR): 9.095

Descriptive statistics

Standard deviation: 10.0105085
Coefficient of variation (CV): 1.139536825
Kurtosis: 9.762040402
Mean: 8.784716981
Median Absolute Deviation (MAD): 4.5
Skewness: 2.752763239
Sum: 931.18
Variance: 100.2102804
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
0                  18     13.6%
2.5                16     12.1%
7                  3      2.3%
6.9                2      1.5%
10.62              2      1.5%
8.1                2      1.5%
8.3                2      1.5%
8                  2      1.5%
15                 2      1.5%
9.8                2      1.5%
Other values (52)  55     41.7%
(Missing)          26     19.7%

Minimum 5 values
Value  Count  Frequency (%)
0      18     13.6%
2.5    16     12.1%
3      1      0.8%
3.4    1      0.8%
3.5    1      0.8%

Maximum 5 values
Value  Count  Frequency (%)
58.15  1      0.8%
50.6   1      0.8%
50.07  1      0.8%
33.75  1      0.8%
31.07  1      0.8%

Interactions

[121 pairwise interaction plots of the 11 numeric variables; images not preserved]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
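
As a minimal sketch of the calculation just described, the pure-Python function below divides the sample covariance by the product of the sample standard deviations. The function name `pearson_r` is illustrative, and the inputs are the Fare_Amt and Total_Amt values from the ten "First rows" of the Sample section; this is not the code the report itself ran.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample covariance and standard deviations; the (n - 1) factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


# Fare_Amt and Total_Amt from the ten "First rows" of the sample.
fares = [8.9, 6.9, 4.1, 4.1, 3.3, 45.0, 6.1, 6.9, 26.5, 8.1]
totals = [9.40, 6.90, 4.10, 4.10, 4.30, 58.15, 8.10, 6.90, 31.07, 8.10]

r = pearson_r(fares, totals)
# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([2 * f + 3 for f in fares], totals)
print(round(r, 4), round(r_rescaled, 4))
```

On these ten rows r is close to +1, in line with the high-correlation warning for Fare_Amt and Total_Amt, and the rescaled input gives the same value, which illustrates the invariance under changes of location and scale.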

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
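
The rank-based calculation can be sketched as follows: replace each value by its rank (ties share the average rank, a common convention assumed here), then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(round(rho, 4), round(r, 4))
```

For this toy data ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.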

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
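
A minimal sketch of this pair-counting definition follows. It implements the formula exactly as stated, which is often called tau-a; library implementations may use a tie-adjusted variant instead, so values can differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Of the six pairs in this example, five are concordant and one is discordant, so τ = (5 − 1) / 6.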

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
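
The sketch below shows one way to compute a bias-corrected Cramér's V from a contingency table of counts, following the correction as it is commonly stated for Bergsma (2013): φ² = χ²/n is reduced by (k − 1)(r − 1)/(n − 1) and floored at 0, and the row and column counts r and k are shrunk by (r − 1)²/(n − 1) and (k − 1)²/(n − 1). The function name is illustrative, the table is assumed to have no empty rows or columns, and this is not taken from the report's own code.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts).

    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 10], [10, 10]]   # rows and columns unrelated
perfect = [[20, 0], [0, 20]]         # each row determines the column
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The two toy tables mark the ends of the scale: the independent table gives 0 and the perfectly associated table gives 1 (up to floating-point rounding).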

Missing values

[Four missing-value plots; images not preserved]

Sample

First rows

index | vendor_name | Trip_Pickup_DateTime | Trip_Dropoff_DateTime | Passenger_Count | Trip_Distance | Start_Lon | Start_Lat | Rate_Code | store_and_forward | End_Lon | End_Lat | Payment_Type | Fare_Amt | surcharge | mta_tax | Tip_Amt | Tolls_Amt | Total_Amt
0 | VTS | 2009-01-04 02:52:00 | 2009-01-04 03:02:00 | 1 | 2.63 | -73.991957 | 40.721567 | NaN | NaN | -73.993803 | 40.695922 | CASH | 8.9 | 0.5 | NaN | 0.0 | 0.00 | 9.40
1 | DDS | 2009-02-03 08:25:00 | 2009-02-03 08:33:39 | 1 | 1.60 | -73.992768 | 40.758324999999999 | NaN | NaN | -73.994710 | 40.739723 | CASH | 6.9 | 0.0 | NaN | 0.0 | 0.00 | 6.90
2 | CMT | 2009-03-26 15:30:14 | 2009-03-26 15:33:45 | 1 | 0.30 | -73.970709 | 40.796382000000001 | NaN | 0 | -73.973602 | 40.792058 | Cash | 4.1 | 0.0 | NaN | 0.0 | 0.00 | 4.10
3 | VTS | 2009-04-08 12:19:00 | 2009-04-08 12:24:00 | 1 | 0.49 | -73.974467 | 40.760793 | NaN | NaN | -73.966770 | 40.757057 | CASH | 4.1 | 0.0 | NaN | 0.0 | 0.00 | 4.10
4 | CMT | 2009-05-27 07:41:05 | 2009-05-27 07:42:28 | 1 | 0.30 | -73.974105 | 40.742891999999998 | NaN | 0 | -73.973769 | 40.746405 | Credit | 3.3 | 0.0 | NaN | 1.0 | 0.00 | 4.30
5 | VTS | 2009-06-14 23:23:00 | 2009-06-14 23:48:00 | 1 | 17.52 | -73.787442 | 40.641525000000001 | NaN | NaN | -73.980072 | 40.742963 | Credit | 45.0 | 0.0 | NaN | 9.0 | 4.15 | 58.15
6 | VTS | 2009-07-15 17:39:00 | 2009-07-15 17:46:00 | 1 | 1.32 | -73.999132 | 40.726542000000002 | NaN | NaN | -73.984907 | 40.736347 | Credit | 6.1 | 1.0 | NaN | 1.0 | 0.00 | 8.10
7 | VTS | 2009-08-12 07:28:00 | 2009-08-12 07:36:00 | 1 | 1.80 | 0.000000 | 0 | NaN | NaN | 0.000000 | 0.000000 | CASH | 6.9 | 0.0 | NaN | 0.0 | 0.00 | 6.90
8 | VTS | 2009-09-24 09:00:00 | 2009-09-24 09:29:00 | 1 | 10.23 | -73.978968 | 40.766173000000002 | NaN | NaN | -73.872268 | 40.774530 | CASH | 26.5 | 0.0 | NaN | 0.0 | 4.57 | 31.07
9 | VTS | 2009-10-26 13:06:00 | 2009-10-26 13:17:00 | 5 | 2.02 | -73.967242 | 40.803224999999998 | NaN | NaN | -73.957517 | 40.783533 | CASH | 8.1 | 0.0 | NaN | 0.0 | 0.00 | 8.10

Last rows

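index | vendor_name | Trip_Pickup_DateTime | Trip_Dropoff_DateTime | Passenger_Count | Trip_Distance | Start_Lon | Start_Lat | Rate_Code | store_and_forward | End_Lon | End_Lat | Payment_Type | Fare_Amt | surcharge | mta_tax | Tip_Amt | Tolls_Amt | Total_Amt
122 | 1 | 2020-02-01 00:17:35 | 2020-02-01 00:30:32 | 1 | 2.6 | 1.0 | N | 145.0 | 7 | 1.0 | 11.0 | 0.5 | 0.5 | 2.45 | 0.0 | 0.3 | 14.75 | 0.0
123 | 1 | 2020-02-01 00:32:47 | 2020-02-01 01:05:36 | 1 | 4.8 | 1.0 | N | 45.0 | 61 | 1.0 | 21.5 | 3 | 0.5 | 6.30 | 0.0 | 0.3 | 31.60 | 2.5
124 | 1 | 2020-03-01 00:31:13 | 2020-03-01 01:01:42 | 1 | 4.7 | 1.0 | N | 88.0 | 255 | 1.0 | 22.0 | 3 | 0.5 | 2.00 | 0.0 | 0.3 | 27.80 | 2.5
125 | 2 | 2020-03-01 00:08:22 | 2020-03-01 00:08:49 | 1 | 0.0 | 1.0 | N | 193.0 | 193 | 2.0 | 2.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 3.80 | 0.0
126 | 1 | 2020-04-01 00:41:22 | 2020-04-01 01:01:53 | 1 | 1.2 | 1.0 | N | 41.0 | 24 | 2.0 | 5.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 6.80 | 0.0
127 | 1 | 2020-04-01 00:56:00 | 2020-04-01 01:09:25 | 1 | 3.4 | 1.0 | N | 95.0 | 197 | 1.0 | 12.5 | 0.5 | 0.5 | 2.75 | 0.0 | 0.3 | 16.55 | 0.0
128 | 1 | 2020-05-01 00:02:28 | 2020-05-01 00:18:07 | 1 | 0.0 | 1.0 | N | 234.0 | 256 | 1.0 | 12.2 | 3 | 0.5 | 2.40 | 0.0 | 0.3 | 18.40 | 2.5
129 | 1 | 2020-05-01 00:23:21 | 2020-05-01 00:26:01 | 2 | 0.4 | 1.0 | N | 264.0 | 264 | 1.0 | 4.0 | 0.5 | 0.5 | 0.50 | 0.0 | 0.3 | 5.80 | 0.0
130 | 1 | 2020-06-01 00:31:23 | 2020-06-01 00:49:58 | 1 | 3.6 | 1.0 | N | 140.0 | 68 | 1.0 | 15.5 | 3 | 0.5 | 4.00 | 0.0 | 0.3 | 23.30 | 2.5
131 | 1 | 2020-06-01 00:42:50 | 2020-06-01 01:04:33 | 1 | 5.6 | 1.0 | N | 79.0 | 226 | 1.0 | 19.5 | 3 | 0.5 | 2.00 | 0.0 | 0.3 | 25.30 | 2.5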